Comparative analysis of periodicity search methods in DNA sequences
نویسندگان
چکیده
To determine the periodicity of a DNA sequence, different spectral approaches are applied (discrete Fourier transform (DFT), autocorrelation (CORR), information decomposition (ID), hybrid method (HYB), concept of spectral envelope for spectral analysis (SE), normalized autocorrelation (CORR_N) and profile analysis (PA). In this work, we investigated the possibility of finding the true period length, by depending on the average number of accumulated changes in DNA bases (PM) for the methods stated above. The results show that for periods with short length (≤4 b.p), it is possible to use the hybrid method (HYB), which combines properties of autocorrelation, Fourier transform, and information decomposition (ID). For larger period lengths (>4) with values of point mutation (PM) equal to 1.0 or more per one nucleotide, it is preferable to use information of decomposition method (ID), as the other spectral approaches cannot achieve correct determination of the period length present in the analyzed sequence.
منابع مشابه
تخمین مکان نواحی کدکننده پروتئین در توالی عددی DNA با استفاده پنجره با طول متغیر بر مبنای منحنی سه بعدی Z
In recent years, estimation of protein-coding regions in numerical deoxyribonucleic acid (DNA) sequences using signal processing tools has been a challenging issue in bioinformatics, owing to their 3-base periodicity. Several digital signal processing (DSP) tools have been applied in order to Identify the task and concentrated on assigning numerical values to the symbolic DNA sequence, then app...
متن کاملA comparative phylogenetic analysis of Theileria spp. by using two two "18S ribosomal RNA" and "Theileria annulata merozoite surface antigen" gene sequences
More than 185 species, strains and unclassified Theileria parasites are categorized in the Entrez Taxonomy. The accurate diagnosis and proper identification of the causative agents are important for understanding the epidemiology, prevention and appropriate treatment. This study aims to discuss the importance of two genes of Theileria annulata 18S ribosomal RNA (18S rRNA) and Theileria annulata...
متن کاملSearch and classification of potential minisatellite sequences from bacterial genomes.
We used the method of Information Decomposition developed by us to identify the latent dinucleotide periodicity regions in bacterial genomes. The number of potential minisatellite sequences obtained at high level of statistical significance was 454. Then we classified the periodicity matrices and obtained 45 classes. We used the other new method developed by us--Modified Profile Analysis--to re...
متن کاملDevelopment of an Efficient Hybrid Method for Motif Discovery in DNA Sequences
This work presents a hybrid method for motif discovery in DNA sequences. The proposed method called SPSO-Lk, borrows the concept of Chebyshev polynomials and uses the stochastic local search to improve the performance of the basic PSO algorithm as a motif finder. The Chebyshev polynomial concept encourages us to use a linear combination of previously discovered velocities beyond that proposed b...
متن کاملDetecting periodic patterns in biological sequences
MOTIVATION The search for repeated patterns in DNA and protein sequences is important in sequence analysis. The rapid increase in available sequences, in particular from large-scale genome sequencing projects, makes it relevant to develop sensitive automatic methods for the identification of repeats. RESULTS A new method for finding periodic patterns in biological sequences is presented. The ...
متن کاملذخیره در منابع من
با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید
برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید
ثبت ناماگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید
ورودعنوان ژورنال:
- Computational biology and chemistry
دوره 53 Pt A شماره
صفحات -
تاریخ انتشار 2014